Method and system for reduced distributed event handling in a network environment

ABSTRACT

The present disclosure details a system, apparatus and method for reducing the redundant handling of distributed network events. In one aspect, a proxy node is selected from a plurality of network nodes and an associated network management station (“NMS”) preferably addresses only the distributed events received from the proxy node. In an alternate embodiment, non-proxy nodes may be limited to reporting node-specific events to the NMS, resulting in a reduction of the number of distributed events received and processed by the NMS to those sent by the proxy node. The proxy node may be selected by the NMS or by the network nodes, in alternate implementations. Availability of the proxy node may be monitored and ensured by the network nodes or by the NMS. The selection of a proxy node is generally repeated upon the addition of nodes to the network or a lapse in proxy node availability.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is related to U.S. patent application Ser. No. 09/738,960 entitled “Caching System and Method for a Network Storage System” by Lin-Sheng Chiou, Mike Witkowski, Hawkins Yao, Cheh-Suei Yang, and Sompong Paul Olarig, which was filed on Dec. 14, 2000 and which is incorporated herein by reference in its entirety for all purposes; U.S. patent application Ser. No. 10/015,047 entitled “System, Apparatus and Method for Address Forwarding for a Computer Network” by Hawkins Yao, Cheh-Suei Yang, Richard Gunlock, Michael L. Witkowski, and Sompong Paul Olarig, which was filed on Oct. 26, 2001 and which is incorporated herein by reference in its entirety for all purposes; U.S. patent application Ser. No. 10/039,190 entitled “Network Processor Interface System” by Sompong Paul Olarig, Mark Lyndon Oelke, and John E. Jenne, which was filed on Dec. 31, 2001, and which is incorporated herein by reference in its entirety for all purposes; U.S. patent application Ser. No. 10/039,189 entitled “Xon/Xoff Flow Control for Computer Network” by Hawkins Yao, John E. Jenne, and Mark Lyndon Oelke, which was filed on Dec. 31, 2001, and which is incorporated herein by reference in its entirety for all purposes; and U.S. patent application Ser. No. 10/039,184 entitled “Buffer to Buffer Flow Control for Computer Network” by John E. Jenne, Mark Lyndon Oelke and Sompong Paul Olarig, which was filed on Dec. 31, 2001, and which is incorporated herein by reference in its entirety for all purposes. This application is also related to the following four U.S. patent applications: U.S. patent application Ser. No. 10/117,418 entitled “System and Method for Linking a Plurality of Network Switches,” by Ram Ganesan Iyer, Hawkins Yao and Michael Witkowski, filed on Apr. 5, 2002 and which is incorporated herein by reference in its entirety for all purposes; U.S. patent application Ser. No. 10/117,040 entitled “System and Method for Expansion of Computer Network Switching System Without Disruption Thereof,” by Mark Lyndon Oelke, John E. Jenne, Sompong Paul Olarig, Gary Benedict Kotzur and Matthew John Schumacher, filed on Apr. 5, 2002 and which is incorporated herein by reference in its entirety for all purposes; U.S. patent application Ser. No. 10/117,266 entitled “System and Method for Guaranteed Link Layer Flow Control,” by Hani Ajus and Chung Dai, filed on Apr. 5, 2002 and which is incorporated herein by reference in its entirety for all purposes; U.S. patent application Ser. No. 10/117,638 entitled “Fibre Channel Implementation Using Network Processors,” by Hawkins Yao, Richard Gunlock and Po-Wei Tan, filed on Apr. 5, 2002 and which is incorporated herein by reference in its entirety for all purposes. This application is a divisional of U.S. patent application Ser. No. 10/117,290 entitled “Method and System For Reduced Distributed Event Handling In A Network Environment,” by Ruotao Huang and Ram Ganesan Iyer, filed on Apr. 5, 2002. The contents of these applications are incorporated herein in their entirety by this reference.

BACKGROUND

1. Technical Field of the Invention

The present application is related to computer networks. More specifically, the present application is related to a system, apparatus and method for handling multiple instances of events while avoiding duplication of work in a distributed network environment.

2. Background of the Invention

The distributed nature of computer networks presents various challenges for their centralized management. One such challenge is event or alarm management and processing. In a typical network environment, distributed network nodes notify the network's central network management server software application of any changes in the state of the network or of individual nodes. In general, the network management application or software may be run on one network server or simultaneously on a plurality of such servers. Such network management applications typically represent a single-point management interface for network administration.

Among the events or alarms typically monitored in a distributed network are distributed and node-specific events. In general, distributed events are those events that may affect the network as a whole. One example of a distributed event is the removal of a device port's entry from an associated Distributed Name Server. Such an event is considered a distributed event because it affects the Distributed Name Server on all of the network's Fibre Channel switches, for example.

Node-specific events, on the other hand, are typically concerned only with the state of an individual node. One example of a node-specific event is a FAN_FAILURE alarm. A FAN_FAILURE alarm is considered a node-specific event because it does not generally affect any nodes in the network other than the node where it originates.

Network management difficulties arise when the same distributed event is sent to the network management application by multiple nodes. If the network management application handles or processes each instance of the reported event without distinguishing whether each event is a different event or multiple copies of the same event, the network management application may suffer performance degradation resulting from double-handling, i.e., the repeated processing or addressing of the same events. Double-handling is typically most dangerous in situations where the network management application handles or processes events based on certain assumptions regarding the current state of the computer network. In other words, when the network management application receives a subsequent copy of the same event, the state of the network may have already been changed as a result of the network management application's handling of the previously reported event. At a minimum, double-handling consumes resources as the network management application attempts to repeatedly handle or process the same event.

Attempts to resolve the issue of double-handling include giving the multiple copies of the same event the same identity tag. In such an implementation, when the network management application receives notification of events, the network management application will begin by examining the identity tags. By examining the identity tags, the network management application can group those events with the same identity tags together, thereby enabling the network management application to handle or process the same event only once.

In reality, however, identity tags are impractical to implement. In one aspect, the need for the nodes to communicate with each other to agree on the identity tag every time they are going to send a notice of an event results in excessive network overhead. In a further aspect, the network management application generally has to keep a history of all the received tags in order to perform tag association.
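
The tag-history burden described above can be illustrated with a minimal Python sketch. The class name, method name and tag strings are hypothetical and are not part of any described system; the sketch only shows that tag association requires the management application to retain every tag it has received.

```python
# Hypothetical sketch of tag-based de-duplication (the prior approach
# discussed above). All names here are illustrative assumptions.
class TagDeduplicator:
    def __init__(self):
        # History of every tag received so far; it grows with each
        # distinct event and is never pruned in this simple sketch.
        self.seen_tags = set()

    def should_handle(self, tag):
        """Return True only for the first message carrying a given tag."""
        if tag in self.seen_tags:
            return False
        self.seen_tags.add(tag)
        return True


dedup = TagDeduplicator()
# Three nodes report "evt-1"; one node reports "evt-2".
results = [dedup.should_handle(t) for t in ["evt-1", "evt-1", "evt-2", "evt-1"]]
```

Only the first copy of each tag is handled, but the set of stored tags must persist for as long as late copies may arrive.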

SUMMARY OF THE INVENTION

The present invention overcomes the above-identified problems as well as other shortcomings and deficiencies by providing a system, apparatus and method for reducing the double-handling of distributed event messages in a computer network environment. In a primary aspect of the present invention, distributed event handling may be reduced by maintaining the availability of a proxy node that is responsible for reporting the distributed events to a network management station (“NMS”).

The present invention provides the technical advantage of properly handling multiple instances of the same event received from the network nodes in a distributed network environment without double-handling while at the same time being able to receive and handle events unique to each individual network node.

The present invention further provides technical advantages through the reduction of instances of double-handling which simultaneously reduces usage of the network's processing resources. In one embodiment, network resource usage may be reduced by sending only one copy of each distributed event to the network management station (“NMS”) and its associated applications for processing.

A further technical advantage provided by the present invention stems from the distributed event handling that is performed primarily by the network management station, eliminating processing efforts from network nodes. Such elimination is invaluable when using the network management station to monitor and manage the networks of third-party nodes.

In another respect, the present invention provides the advantage of reducing distributed event double-handling without consuming network management resources by pushing the elimination of redundant distributed event messages down to the network nodes.

BRIEF DESCRIPTION OF THE DRAWINGS

A more complete understanding of the present embodiments and advantages thereof may be acquired by referring to the following description taken in conjunction with the accompanying drawings, in which like reference numbers indicate like features, and wherein:

FIG. 1 is a schematic drawing depicting a computer network formed in accordance with teachings of the present invention;

FIG. 2 is a flow diagram depicting a method for reducing the repeated handling of distributed network events according to teachings of the present invention;

FIG. 3 is a schematic drawing depicting an alternate embodiment of a computer network formed in accordance with teachings of the present invention; and

FIG. 4 is a flow diagram depicting a method for reducing distributed event messaging through the maintenance of a proxy node by and among the network nodes, according to teachings of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Preferred embodiments of the present invention and its advantages are best understood by referring to FIGS. 1 through 4 of the drawings, like numerals being used for like and corresponding parts of the various drawings.

Illustrated in FIG. 1 is an exemplary embodiment of a computer network incorporating teachings of the present invention. The computer network, indicated generally at 100, preferably includes one or more network management stations 103. The network management station 103 may be any computing device capable of performing the methods described herein as well as capable of communicating with nodes 106, 109 and 112 or the like via communication network 115.

In one embodiment, the network management station 103 may include a monitor or display 118, a central processing unit 121, and one or more user input devices (not expressly shown). Examples of user input devices include, but are not limited to, a computer mouse, keyboard, touchscreen, voice recognition hardware and software, as well as other input devices. The central processing unit 121 may take many forms. For example, the central processing unit 121 may be a mainframe computer, a server computer, a desktop computer, a laptop computer, application blade or any other computer device capable of responding to event messages generated and communicated by the network nodes 106, 109, and 112 as well as monitoring, repairing or otherwise managing computer network 100.

The network management station 103 preferably operates one or more network management applications in order to maximize the uptime, effectiveness, utilization and other operational characteristics of the computer network 100. In a network consisting of distributed network nodes, such as computer network 100, such network management applications are typically employed on a central network management station to provide a single point of network management for network administration.

Network management applications are typically operable to monitor, configure, test or otherwise manage generally all aspects of their associated computer networks and the computing components coupled to those networks. For example, a network management application may be configured to detect the addition of network nodes to the computer network. Further, the network management application may be able to detect the availability of individual network nodes or other devices coupled to the computer network. Preferably, the network management application is able to address event messages, distributed or node-specific, generated by the network nodes as well as to perform other network management functions and operations. A network management application may include a single software application or a plurality of software applications cooperating to achieve various network management functions.

The communication network 115 may include such network configurations as a local area network (“LAN”), wide area network (“WAN”), metropolitan area network (“MAN”), storage area network (“SAN”), their substantial equivalents or any combination of these and/or other network configurations. In addition, the communication network 115 may use physical or wireline communication protocols and media including, but not limited to, metal wires and cables made of copper or aluminum, fiber-optic lines, and cables constructed of other metals or composite materials satisfactory for carrying electromagnetic signals, electrical power lines, electrical power distribution systems, building electrical wiring, conventional telephone lines, coaxial cable, Ethernet, Gigabit Ethernet, Token Ring and Fibre Channel. Further, the communication network 115 may also use wireless communication schemes including, but not limited to, Bluetooth, IEEE 802.11b, infra-red, laser and radio frequency, including the 800 MHz, 900 MHz, 1.9 GHz and 2.4 GHz bands, in addition to or in lieu of one or more wireline communication schemes.

As illustrated in FIG. 1, a plurality of network nodes 106, 109 and 112 are preferably communicatively coupled to the communication network 115. The network nodes 106, 109 and 112 may be implemented using a variety of computing components. In general, each of the network nodes 106, 109 and 112 preferably includes at least one processor, and a memory and communication interface operably coupled to the processor (not expressly shown). Examples of network node devices suitable for use with the present invention include, but are not limited to, servers, mainframes, laptops, switches, routers, bridges, hubs, application blades or the like. The network nodes 106, 109 and 112 in a given computer network may include like devices or a variety of different devices.

In one embodiment of the present invention, the network nodes 106, 109 and 112 are preferably application blades, where an application blade may be defined as any electronic device that is able to perform one or more functions. For example, an application blade may be a peripheral card that is connected to a server or other device that is coupled to a switch. Other examples of application blades include, but are not limited to: remote computing devices communicatively coupled to the communication network 115 by a network connection; software processes running virtually on a single or multiprocessing system and/or single or multithreading processor; electronic appliances with specific functionality; or the like.

In a typical computer network configuration, the network nodes coupled thereto generally report node-specific events, i.e., events generally affecting only the reporting node, and distributed events, i.e., events generally affecting the whole of the computer network, as they are detected, observed or otherwise become known to a network node. Arrows 124, 127, and 130 indicate generally the reporting of all events detected by the network nodes 106, 109, and 112, i.e., the reporting of both node-specific events and distributed events. As a result, repeated messages regarding the same distributed event are often reported to the network management station by a plurality, if not all, of the reporting-enabled network nodes. The methods of the present invention reduce or eliminate the potential redundant handling of repeated distributed event messages by recognizing a proxy node from the plurality of network nodes that is responsible for reporting distributed events.

As shown in FIG. 1, the network node 106 may be designated as the proxy node for the computer network 100. In general operation, when the network management station 103 receives a distributed event message from the communication network 115, it preferably interrogates or otherwise identifies the source of the distributed event message to determine whether the distributed event message was originated or sent by the proxy node or network node 106 as illustrated in FIG. 1. If the network management station 103 determines that the distributed event message received was generated by or originated from the proxy node 106, then the network management station 103 preferably handles, processes or otherwise addresses the substance of the event, e.g., removal of a device port's entry from an associated Distributed Name Server. Alternatively, if the network management station 103 determines that the distributed event message was sent by a non-proxy node, such as network node 109 and/or 112, then the network management station 103 preferably further interrogates the distributed event message to determine whether the distributed event need be addressed by the network management station 103 or if the distributed event message can be discarded, delegated or otherwise unprocessed. All node-specific event messages from all of the reporting network nodes 106, 109, and 112 are preferably handled, processed or otherwise addressed by the network management station 103. Additional detail regarding the operational aspects of the present invention will be discussed below with reference to FIG. 2.

Referring now to FIG. 2, a flow diagram illustrating a network management station based method of reducing or eliminating the repeated handling of distributed network event messages is shown, according to teachings of the present invention. Prior to the initiation of method 200, the network management station 103 preferably selects or designates one of its associated network nodes 106, 109, and 112 to initially serve as the proxy node. The initial selection of a proxy node may be made at random, according to a network address or according to a wide variety of other proxy node selection methods. Once a proxy node has been selected, the network management station 103 notes or stores the proxy node's identity, e.g., its network address, network interface card identifier, etc., for later comparison.
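
As a minimal sketch of the initial selection just described, the following Python function picks a proxy node either at random or by network address. The function name, the strategy labels and the choice of "lowest IP address" as the address-based rule are assumptions made for illustration; the description does not prescribe a particular rule.

```python
import ipaddress
import random


def select_proxy(node_addresses, strategy="lowest_address", rng=None):
    """Pick an initial proxy node from a list of IP address strings.

    Illustrative assumption: "according to a network address" is taken
    here to mean the numerically lowest address.
    """
    if not node_addresses:
        raise ValueError("no network nodes available for proxy selection")
    if strategy == "random":
        return (rng or random).choice(list(node_addresses))
    if strategy == "lowest_address":
        # Compare addresses numerically rather than as plain strings.
        return min(node_addresses, key=ipaddress.ip_address)
    raise ValueError(f"unknown strategy: {strategy}")


nodes = ["10.0.0.10", "10.0.0.9", "10.0.0.12"]
stored_proxy_id = select_proxy(nodes)  # kept by the NMS for later comparison
```

The returned value stands in for the identity that the network management station 103 stores for later comparison.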

Method 200 preferably begins at step 203 where the network management station 103 is in receipt of an event message from the communication network 115. Upon receipt of the event message, method 200 preferably proceeds to step 206 where the event message may be evaluated to determine whether it contains a node-specific event or a distributed event.

If at step 206 the network management station 103 determines that the received event message contains a node-specific event, method 200 preferably proceeds to step 209. At step 209, the network management station 103 preferably addresses the node-specific event according to one or more network management settings. For example, if the node-specific event indicates that a cooling fan has failed at the network node reporting the node-specific event, the network management station 103 may generate an electronic message notifying a technician that the fan at the reporting or source network node needs maintenance. Alternatively, if the node-specific event indicates that an application on the reporting node is corrupt, or otherwise in need of repair, the network management station 103 may initiate a reinstall or software download and update routine to repair the corrupt application. Other methods of addressing, processing or otherwise handling node-specific events are contemplated within the spirit and scope of the present invention.

Once the network management station 103 has addressed, or initiated a response to the node-specific event, method 200 preferably proceeds to step 212 where the network management station 103 may await receipt of the next event message before returning to step 203. In a further embodiment, the network management station 103 may verify that addressed or processed events were actually corrected and reinitiate processing in the instance the event persists.

If at step 206 the review of the received event message indicates that the event message pertains to a distributed event, method 200 preferably proceeds to step 215. At step 215 the network management station 103 may identify the address of origination for the distributed event message or otherwise identify the node from which the distributed event message was received or from which network node the distributed event message originated.

A variety of methods may be employed to identify the network node from which the distributed event message originated. Such methods include, but are not limited to, parsing a header included with the event message to obtain the network/Internet Protocol/Fibre Channel address or other unique identifier of the sending network node. Another method of originating node identification may include parsing the distributed event message or a header associated with the distributed event message to locate and/or identify a unique identifier associated with the sending or originating node's network communication device such as a network interface card. Additional methods of identifying the source or origination of a received distributed event message are contemplated within the spirit and scope of the present invention.

Once the network management station 103 has obtained the information preferred to determine the originator or sender of the distributed event message, method 200 preferably proceeds to step 218. At step 218, the network management station 103 preferably determines whether the network node which sent the distributed event message is the proxy node 106 for the computer network 100 or whether the distributed event message originated from a non-proxy node 109 and/or 112. To determine whether the proxy node 106 or a non-proxy node 109 and/or 112 originated or sent the distributed event message, the network management station 103 may compare the sending address, the address of origination, or other unique identifier obtained from the distributed event message with stored information identifying the computer network's 100 proxy node selection. Alternative methods such as eliminating the non-proxy nodes 109 and/or 112 as being the sender of the distributed event message may also be employed.
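
The comparison performed at steps 215 and 218 can be sketched as follows. The message layout (a `source` field assumed to have been parsed from the message header) and all names are hypothetical; real event messages would carry whatever identifier the network actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventMessage:
    # Hypothetical fields, assumed to be already parsed from the header.
    source: str   # address or other unique identifier of the sending node
    kind: str     # "distributed" or "node_specific"
    payload: str  # description of the reported issue


def is_from_proxy(message, stored_proxy_id):
    """Step 218 sketch: compare the sender with the stored proxy identity."""
    return message.source == stored_proxy_id


stored_proxy_id = "10.0.0.6"  # illustrative identity stored by the NMS
from_proxy = EventMessage("10.0.0.6", "distributed", "name server entry removed")
from_other = EventMessage("10.0.0.9", "distributed", "name server entry removed")
```

In this sketch the first message would proceed to step 221 and the second to step 224.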

If the network management station 103 determines that the distributed event was originated or sent by the proxy node 106, method 200 preferably proceeds to step 221. At step 221, the network management station 103 preferably initiates one or more routines to resolve the issue reported in the distributed event. As mentioned above, many network management station 103 or network management settings may be configured and used to address or process the various sorts of distributed events that may occur in the computer network 100. For example, a technician may be notified of repairs needed via an electronic communication generated by the network management station 103 or the network management station 103 may initiate a software routine directed at resolving the issue reported in the distributed event. Alternative methods of network management station 103 settings aimed at resolving distributed events are contemplated within the spirit and scope of the present invention.

Once the network management station 103 has addressed the content of a distributed event, method 200 preferably proceeds to step 212 where the network management station 103 may await receipt of the next event message before returning to step 203.

If at step 218 the network management station 103 determines that the distributed event message received originated or was sent by a non-proxy node 109 and/or 112, method 200 preferably proceeds to step 224. At step 224, the network management station 103 may access or otherwise evaluate the contents of the distributed event message to determine the issue being reported by the distributed event message. In a preferred embodiment, the network management station 103 preferably interrogates the distributed event messages received from a non-proxy node 109 and/or 112 to determine if the distributed event issue indicates a problem or a change associated with the proxy node 106. For example, the network management station 103 may wish to determine if the distributed event message received from a non-proxy node 109 and/or 112 indicates that the proxy node 106 has been removed from the communication network 115, that the proxy node's 106 identifier, e.g., network address, has changed, or that the proxy node 106 is otherwise unavailable.

Once the network management station 103 has accessed or interrogated the contents of the distributed event message originated or sent by a non-proxy node 109 and/or 112, method 200 preferably proceeds to step 227. At step 227, the network management station 103 may determine whether the distributed event message originated by a non-proxy node 109 and/or 112 indicates that the proxy node 106 has been removed or hidden from the network. If the distributed event message received from the non-proxy node 109 and/or 112 indicates that the proxy node 106 has been removed from the communication network 115, method 200 preferably proceeds to step 230.

At step 230, the network management station 103 preferably reassigns proxy status from the unavailable proxy node to the non-proxy node 109 and/or 112 that sent or originated the distributed event message being processed. For example, if the distributed event indicating that the proxy node 106 has been removed from the communication network 115 originated or was sent by non-proxy node 109, the network management station 103 may select or designate non-proxy node 109 as the new proxy node for the other non-proxy nodes in the computer network 100. Before reassigning proxy status, network management station 103 may be configured to execute one or more attempts to bring an unavailable proxy node back on line or to otherwise make an unavailable proxy node available again.
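
A minimal sketch of step 230, including the optional recovery attempts, might look like the following. The `state` dictionary, the `try_restore` callback and the `attempts` parameter are hypothetical conveniences, not elements of the described system.

```python
def reassign_proxy(state, reporter, try_restore=None, attempts=0):
    """Step 230 sketch: hand proxy status to the reporting non-proxy node.

    state       -- dict holding the current proxy identity under "proxy"
    reporter    -- identity of the node that reported the proxy's removal
    try_restore -- optional callable(proxy_id) -> bool that attempts to
                   bring the old proxy back on line (an assumption here)
    attempts    -- number of recovery attempts before reassigning
    """
    for _ in range(attempts):
        if try_restore is not None and try_restore(state["proxy"]):
            return state["proxy"]  # old proxy is available again
    state["proxy"] = reporter
    return reporter


state = {"proxy": "node-106"}
new_proxy = reassign_proxy(state, "node-109")
```

With no recovery attempts configured, proxy status passes directly to the reporting node.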

In an alternate implementation of method 200, the network management station 103 may initiate a routine to designate a non-proxy node 109 and/or 112 that will replace an unavailable proxy node. Other methods and implementations of designating a replacement proxy node are contemplated within the spirit and scope of the present invention.

Once the network management station 103 has addressed the removed or unavailable proxy node issue at step 230, method 200 preferably proceeds to step 212 where the network management station 103 may await receipt of the next event message before returning to step 203.

If at step 227 the network management station 103 determines that the contents of the distributed event message originated or sent by a non-proxy node 109 and/or 112 indicate a problem other than the removal of the proxy node 106 from the communication network 115, method 200 preferably proceeds to step 233. At step 233, the network management station 103 preferably further evaluates the contents of the distributed event message to determine if the distributed event message indicates that the address of the proxy node 106 has been altered or otherwise changed.

In one example, the address of the proxy node 106 may be defined as a unique identifier for the proxy node 106 used by the network management station 103. Examples of such unique identifiers include, but are not limited to, the host/Internet Protocol (“IP”) address in an IP network or the Fibre Channel address of the proxy node 106 in a Fibre Channel network. Thus, if the IP address of the proxy node 106 is used by the network management station 103 and a distributed event message from a non-proxy node 109 and/or 112 informs the network management station 103 that the proxy node's 106 IP address has changed, then the network management station 103 may update its proxy node 106 address with the new value.

If the distributed event message originated or sent by a non-proxy node 109 and/or 112 indicates that the address of the proxy node 106 has been changed, method 200 preferably proceeds to step 236. At step 236, the network management station 103 preferably updates a stored address for the proxy node 106 with the address reported in the distributed event message originated by a non-proxy node 109 and/or 112. Alternative implementations of updating the network address of the proxy node 106, including, but not limited to, the network management station 103, on its own, verifying or otherwise obtaining the new network address for the proxy node 106, are contemplated within the spirit and scope of the present invention.

Once the network management station 103 has addressed the non-proxy node 109 and/or 112 originated distributed event message at step 236, method 200 preferably proceeds to step 212 where the network management station 103 may await receipt of the next event message before returning to step 203.

If at step 233 the network management station 103 determines that the distributed event message originated or sent by a non-proxy node 109 and/or 112 does not indicate that the address of the proxy node 106 has been changed, method 200 preferably proceeds to step 239. At step 239 the distributed event message originated or sent by a non-proxy node 109 and/or 112 may be discarded by the network management station 103. Method 200 may be modified such that distributed event messages originated or sent by a non-proxy node 109 and/or 112 are discarded only after the network management station 103 determines that the distributed event messages have or will have no effect on the proxy node 106 if not addressed. The network management station 103 may also be configured to delegate distributed event handling where the distributed event message does not affect the proxy node 106.

Once the network management station 103 has addressed the distributed event message originated or sent by a non-proxy node at step 239, method 200 preferably proceeds to step 212 where the network management station 103 may await receipt of the next event message before returning to step 203.
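
The decision flow of method 200, from step 203 through step 239, can be summarized in a short Python sketch. The event kinds `PROXY_REMOVED` and `PROXY_ADDRESS_CHANGED`, the `Event` fields and the returned status strings are all illustrative assumptions; the step numbers in the comments refer to FIG. 2 as described above.

```python
from dataclasses import dataclass


@dataclass
class Event:
    source: str            # identifier of the sending node
    distributed: bool      # False means node-specific
    kind: str              # hypothetical event name
    new_address: str = ""  # used only for PROXY_ADDRESS_CHANGED


class ManagementStation:
    def __init__(self, proxy):
        self.proxy = proxy   # stored proxy identity
        self.handled = []    # events whose substance was addressed

    def receive(self, event):
        if not event.distributed:                  # step 206 -> step 209
            self.handled.append(event.kind)
            return "handled_node_specific"
        if event.source == self.proxy:             # steps 215/218 -> 221
            self.handled.append(event.kind)
            return "handled_distributed"
        if event.kind == "PROXY_REMOVED":          # steps 224/227 -> 230
            self.proxy = event.source
            return "proxy_reassigned"
        if event.kind == "PROXY_ADDRESS_CHANGED":  # step 233 -> 236
            self.proxy = event.new_address
            return "proxy_address_updated"
        return "discarded"                         # step 239


nms = ManagementStation(proxy="node-106")
outcome = nms.receive(Event("node-109", True, "NAME_SERVER_ENTRY_REMOVED"))
```

Each call to `receive` corresponds to one pass through the flow, after which the station awaits the next message (step 212).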

As described, method 200 provides numerous advantages over existing distributed event handling methods. One such advantage is that method 200 does not require the network nodes 106, 109 and 112 to take part in the proxy node selection process; for computer networks composed of heterogeneous network nodes, such network node participation may be impractical to implement. An additional advantage of method 200 is that, compared to those network management systems that only monitor and handle events from an individual node in the network, method 200 will not miss distributed events even if the network node they are monitoring is not the proxy node.

Illustrated in FIG. 3 is an exemplary embodiment of a computer network,similar to computer network 100, incorporating teachings of the presentinvention. Among the similarities between computer network 100,illustrated in FIG. 1, and computer network 300, are network managementstation 103, network nodes, 106, 109 and 112 and communication network115. Other similarities, such as the monitor 118 and the centralprocessing unit 121 of the network management station 103, are alsopresent.

Computer network 300 differs, however, from the computer network 100 illustrated in FIG. 1 in its implementation of distributed event handling reduction. Specifically, the computer network 300 illustrated in FIG. 3 preferably implements method 400, illustrated in FIG. 4, to reduce distributed event reporting to the network management station 103.

In general, method 400, as described in greater detail below, preferably enables the network nodes 106, 109 and 112 to select by and among themselves a proxy node, e.g., network node 106, as indicated generally at arrows 303, 306 and 312. Upon doing so, the proxy node 106 is preferably enabled to report both distributed events and node-specific events to the network management station 103, as indicated generally by arrow 124. Network or non-proxy nodes 109 and/or 112, on the other hand, are preferably configured to report only node-specific events, as indicated generally by arrows 303 and 306, so long as the proxy node 106 remains on line or available. As a result, the proxy node 106 is the primary network node responsible for reporting distributed events to the network management station 103 while it is available. Such a configuration reduces network traffic and the double-handling of distributed events as reported by other methods. As will be discussed in greater detail below, should the proxy node 106 become unavailable, the non-proxy nodes 109 and/or 112 will preferably select a new proxy node and continue operation of the computer network 300 according to method 400.

The network distributed event handling method of FIG. 4 generally functions by placing the intelligence of avoiding double-handling into the network nodes 106, 109 and 112. In method 400 generally, one of the network nodes 106, 109 and 112 is agreed upon as the proxy node by all the network nodes participating in proxy node selection or available on the communication network 115. Instead of the network management station 103 designating a proxy node, all the network nodes 106, 109 and 112 are participants in the proxy node selection process. Once a proxy node has been selected, only the proxy node will report distributed events. Both the proxy node and the non-proxy nodes, on the other hand, are preferably configured to report, message or send out node-specific events to the network management station 103.

Depicted in FIG. 4 is a method for selecting a proxy node from a plurality of network nodes in which the proxy node selection is made entirely by the network nodes themselves, according to teachings of the present invention. For a newly established, upstart or otherwise initial run of a computer network, method 400 begins generally at step 403. Method 400 may be modified to operate on a newly configured computer network or an existing computer network such that method 400 maintains operation of the computer network according to teachings of the present invention.

Upon initiation of the network at step 403, method 400 preferably proceeds to step 406. At step 406, assuming for purposes of description that method 400 is being implemented on a newly established computer network, a proxy node selection message is preferably sent by and between all of the network nodes 106, 109 and 112 in the computer network 300. The proxy node selection message transmission may be initiated and sent by a selected one of the network nodes 106, 109 and 112 by design, e.g., a network administrator may designate a node to initiate proxy node selection, or the first node to detect an event to be reported may be the node to initiate proxy node selection. Preferably included in each proxy node selection message are both information pertaining to the source node of the proxy node selection message and the existing proxy node selection known to the source node, if any is so known.

Once a proxy node selection message has been transmitted by and between each of the network nodes 106, 109 and 112 participating in the proxy node selection process, the network nodes 106, 109 and 112 may begin the proxy node selection process at step 409. The embodiment of method 400 illustrated in FIG. 4 assumes that new network nodes do not have an existing proxy node selection available to them. However, using the teachings below regarding network nodes having an existing proxy node selection available to them, method 400 may be altered or modified such that a newly established network utilizes the existing proxy node selections available.

At step 409, a proxy node selection is made by the new network nodes using an agreed upon selection rule. For example, the agreed upon selection rule may be derived from the physical layout of the network nodes and their associated communication network. Alternatively, the agreed upon selection rule may select a proxy node based on the IP address of the nodes, a Fibre Channel network address or on time stamps associated with the exchanged proxy node selection messages. The proxy node selection rule may be established at the deployment of method 400 by a network administrator, for example. Additional proxy node selection rules are contemplated within the spirit and scope of the present invention.
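By way of illustration only, an IP address based selection rule might be sketched as follows. The description does not specify how addresses are compared; choosing the numerically lowest address is an assumption made for this sketch, and the function name `select_proxy_by_ip` is hypothetical. Because every node applies the same deterministic rule to the same set of addresses, each node arrives at the same proxy without further negotiation.

```python
import ipaddress


def select_proxy_by_ip(node_addresses):
    """Return the address of the node chosen as proxy.

    Assumption (not from the description): the node with the numerically
    lowest IP address wins. All addresses are assumed to be of the same
    IP version, since IPv4 and IPv6 addresses cannot be ordered together.
    """
    if not node_addresses:
        raise ValueError("at least one participating node is required")
    # Compare parsed addresses, not strings, so "10.0.0.9" sorts below "10.0.0.12".
    return min(node_addresses, key=ipaddress.ip_address)


# Hypothetical addresses for three participating nodes.
nodes = ["10.0.0.12", "10.0.0.9", "10.0.0.106"]
proxy = select_proxy_by_ip(nodes)
```

A numeric comparison is used rather than a string comparison because a plain string ordering would rank "10.0.0.106" ahead of "10.0.0.9".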

Once a current or initial proxy node for the computer network has been selected according to the agreed upon proxy node selection rule at step 409, method 400 preferably proceeds to step 412. At step 412, event detection and generation or monitoring may be initiated in the current proxy node 106 and the non-proxy nodes 109 and/or 112, i.e., those network nodes not currently designated as the proxy node.

According to teachings of the present invention and method 400, the current proxy node 106 and non-proxy nodes 109 and/or 112 are preferably configured for event detection, generation and monitoring differently. Specifically, the current proxy node 106 is preferably configured to detect and report both distributed events and node-specific events. The current or initial non-proxy nodes 109 and/or 112 are preferably configured only to detect and report node-specific events, so long as the current or initial proxy node 106 remains available. At this point, in any event, the computer network 300 is preferably available for use and each non-proxy node 109 and/or 112 is preferably monitoring itself for node-specific events and the proxy node 106 is preferably both monitoring itself for node-specific events and monitoring the computer network 300 for distributed events.
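The difference in reporting behavior between the proxy node and the non-proxy nodes can be expressed as a simple predicate. The following is a minimal sketch; the names `EventKind` and `should_report` and the string node identifiers are hypothetical and chosen only for illustration.

```python
from enum import Enum


class EventKind(Enum):
    DISTRIBUTED = "distributed"
    NODE_SPECIFIC = "node-specific"


def should_report(node_id, proxy_id, kind):
    """Decide whether a node reports an event to the management station.

    Every node reports its own node-specific events; only the node
    currently selected as proxy reports distributed events.
    """
    if kind is EventKind.NODE_SPECIFIC:
        return True
    return node_id == proxy_id


# Hypothetical identifiers mirroring the reference numerals in the text.
proxy_reports = should_report("106", "106", EventKind.DISTRIBUTED)
non_proxy_reports = should_report("109", "106", EventKind.DISTRIBUTED)
```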

From step 412, method 400 preferably proceeds to step 415 which is preferably a wait state for the network management station 103. At step 415, the computer network 300 is preferably operating as desired, transferring communications as requested, etc. In addition, the computer network 300 is preferably being monitored for the addition of new nodes. Monitoring and notification of the presence of new nodes may be accomplished using a variety of methods. For example, as a new node is added to the computer network 300, the new node may be configured to transmit a signal to the existing nodes on the network that it has been added. Alternatively, the network management station 103 may be configured to periodically poll the computer network 300 to detect the presence of new nodes, detect missing nodes as well as to accomplish other network management goals. In yet another example, the current proxy node 106 or one of the current non-proxy nodes 109 and/or 112 may be configured to monitor the computer network 300 for the addition of new network nodes.

In addition to monitoring the computer network 300 for new network nodes at step 415, method 400 is preferably also monitoring the availability of the current proxy node 106. According to teachings of the present invention, in the event the current proxy node 106 becomes unavailable, method 400 preferably initiates a new proxy node selection process generally as described below.

Monitoring the availability of the current proxy node 106 may be accomplished using a variety of processes. For example, once the current proxy node 106 has been selected, in addition to configuring the current proxy node 106 to report both distributed events and node-specific events, the current proxy node 106 may be configured such that it provides a heartbeat signal to the non-proxy nodes 109 and/or 112. In such an implementation, when one of the non-proxy nodes 109 and/or 112 ceases to receive the heartbeat signal from the current proxy node 106, the non-proxy node 109 and/or 112 may verify the unavailability of the proxy node 106 and/or initiate the selection process for a replacement proxy node. In an alternate implementation, one or more of the non-proxy nodes 109 and/or 112 may be configured to periodically verify that the current proxy node 106 is available. In the event a non-proxy node 109 and/or 112 is unable to communicate with the current proxy node 106 or otherwise determines that the current proxy node 106 is unavailable, the process of selecting a replacement or new proxy node may be initiated by the discovering non-proxy node 109 and/or 112.
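One way a non-proxy node might track the heartbeat signal is sketched below. The description does not define a timeout value or a message format; the class name `HeartbeatMonitor`, the `timeout_s` parameter and the injectable clock are assumptions introduced for this illustration.

```python
import time


class HeartbeatMonitor:
    """Tracks when the proxy node's heartbeat was last seen (hypothetical helper)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s      # assumed tolerance before declaring a lapse
        self._clock = clock             # injectable so the logic can be exercised offline
        self._last_seen = clock()

    def record_heartbeat(self):
        """Call whenever a heartbeat from the current proxy node arrives."""
        self._last_seen = self._clock()

    def proxy_unavailable(self):
        """True once no heartbeat has arrived within the timeout window."""
        return self._clock() - self._last_seen > self.timeout_s


# Simulated clock: heartbeat at t=2.0, then silence.
now = [0.0]
monitor = HeartbeatMonitor(timeout_s=3.0, clock=lambda: now[0])
now[0] = 2.0
monitor.record_heartbeat()
now[0] = 4.9
still_available = not monitor.proxy_unavailable()
now[0] = 5.1
lapsed = monitor.proxy_unavailable()
```

When `proxy_unavailable()` becomes true, the detecting node could verify the lapse and then initiate selection of a replacement proxy node.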

In the event a new network node is added to or detected on the computer network 300 or in the event the current proxy node 106 has been determined to be unavailable, method 400 preferably proceeds to step 418. At step 418, event generation in the computer network 300, i.e., in the network nodes 106, 109 and 112, is preferably stopped or paused. Once event generation has been stopped or paused, method 400 preferably proceeds to step 421. At step 421, the process of selecting a new or replacement proxy node may be initiated. The proxy node selection process may vary slightly at step 421 depending on whether the proxy node selection process was initiated in response to the addition of a new node to the computer network 300 or in response to the unavailability of the current proxy node 106.

In response to the addition of a new node to the computer network 300, a proxy node selection message is preferably sent to each new node added to the network at step 421. In an exemplary embodiment of the present invention, the current proxy node 106 may be responsible for initiating the exchange of proxy node selection messages with the new network nodes. In the event that new nodes have been added to the computer network 300 and the current proxy node 106 is also unavailable, one of the non-proxy nodes 109 and/or 112 may be responsible for sending out the proxy node selection message to the new nodes and the remaining available nodes. Alternatively, in such an event, the network management station 103 may be responsible for initiating the proxy selection process, the remaining steps of the proxy selection process preferably being performed by the available network nodes, without additional input or support from the network management station 103.

Alternatively, if proxy node selection messages are being sent in response to the unavailability of the current proxy node 106, the proxy node selection messages are preferably exchanged by and between all of the non-proxy nodes 109 and/or 112 and/or all of the network nodes available on the communication network 115 at step 421. In an exemplary embodiment, the non-proxy node 109 and/or 112 detecting and/or determining the unavailability of the current proxy node 106 may be responsible for initiating the exchange of proxy node selection messages between the appropriate non-proxy and network nodes. In addition, in such an event, the non-proxy node 109 and/or 112 initiating a new proxy node selection process may indicate the unavailability of the current proxy node 106 to the remaining non-proxy nodes 109 and/or 112 such that each may release its current proxy node selection setting. Alternative implementations of proxy node selection message initiation and generation are contemplated within the spirit and scope of the present invention.

Once the proxy node selection messages have been exchanged by and between the appropriate network nodes, both new and non-proxy, method 400 preferably proceeds to step 424. At step 424, the available network nodes preferably wait for the proxy node selection messages from each of the other nodes participating in the proxy node selection process. For example, in the newly added network node scenario described above, the current proxy node 106 and the existing non-proxy nodes 109 and/or 112 will preferably wait for return proxy node selection messages from each of the newly added network nodes. Alternatively, if the current proxy node 106 is managing the proxy node selection process with the new network nodes, the current proxy node 106 may remain in wait at step 424.

Upon receipt of each return proxy node selection message, method 400 preferably proceeds to step 427 where a check is made to determine if the returning proxy node selection messages have been received from all of the nodes participating in the proxy node selection process, for example, from all of the new network nodes. If it is determined that there are nodes from which a return proxy node selection message has not been received, method 400 preferably returns to step 424 where the remaining return proxy node selection messages may be awaited. If it is determined that all of the nodes participating in the proxy node selection process have returned a proxy node selection message, method 400 preferably proceeds to step 430. Alternatively, if method 400 has returned to step 424 to await additional return proxy node selection messages but no additional return proxy node selection messages are received within some defined time window, method 400 may proceed to step 430.
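The wait-and-check loop of steps 424 and 427, including the time window, might be sketched as follows. This is an illustration under stated assumptions: return messages are modeled as `(source, existing_proxy)` tuples delivered through an in-process `queue.Queue`, and the function name `collect_replies` and the `window_s` parameter are hypothetical. A real node would receive these messages over the communication network.

```python
import queue
import time


def collect_replies(inbox, expected, window_s):
    """Gather return proxy node selection messages until all expected
    nodes have answered or the time window closes.

    inbox    -- queue.Queue of (source, existing_proxy) tuples (assumed format)
    expected -- set of node identifiers participating in the selection
    window_s -- assumed overall time window, in seconds
    """
    deadline = time.monotonic() + window_s
    replies = {}
    while len(replies) < len(expected):            # step 427: all nodes heard from?
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break                                  # window elapsed; go on to step 430
        try:
            source, existing_proxy = inbox.get(timeout=remaining)   # step 424: wait
        except queue.Empty:
            break
        if source in expected:                     # ignore non-participants
            replies[source] = existing_proxy
    return replies


inbox = queue.Queue()
inbox.put(("n1", "p"))
inbox.put(("n2", None))
all_replies = collect_replies(inbox, {"n1", "n2"}, window_s=0.5)
```

Using a single deadline for the whole exchange, rather than a per-message timeout, keeps the total wait bounded even when messages trickle in slowly.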

At step 430, a determination is made as to whether any of the return proxy node selection messages contain an existing proxy node selection. As mentioned above, the proxy node selection messages preferably include both information as to the source of the proxy node selection message and information as to the existing proxy node selection known to the source node, if any. For example, if proxy node selection was initiated in response to the addition of nodes to the computer network 300, each of the existing non-proxy nodes 109 and/or 112 and the proxy node 106 already on the computer network 300 should each indicate an existing proxy node selection, i.e., the current proxy node 106. Alternatively, if the proxy node selection process was initiated in response to the unavailability of the current proxy node 106, each of the return proxy node selection messages from the non-proxy nodes 109 and/or 112 participating in the new proxy node selection process may not have an existing proxy selection, e.g., each non-proxy node 109 and/or 112 may have released its existing proxy node selection setting in response to the knowledge that the current proxy node 106 has become unavailable.

If at step 430 it is determined that there are no existing proxy selections in the return proxy node selection messages, method 400 preferably proceeds to step 433. At step 433, a new proxy node may be selected from the nodes available on the computer network 300 according to a selection rule agreed upon by the nodes. Examples of such a rule include, but are not limited to, an Internet Protocol address based rule, a Fibre Channel node World Wide Name based rule and an earliest time stamp based rule using the timestamps preferably included in the proxy node selection messages, as mentioned above.
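An earliest time stamp based rule could be sketched as shown below. The message layout is an assumption: the description states only that a proxy node selection message identifies its source, carries any existing proxy node selection known to the source, and preferably includes a timestamp. The class `ProxySelectionMessage`, its field names, and the tie-break on the source identifier are hypothetical additions for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProxySelectionMessage:
    source: str                           # node that sent the message
    timestamp: float                      # assumed to be comparable across nodes
    existing_proxy: Optional[str] = None  # proxy already known to the source, if any


def select_by_earliest_timestamp(messages):
    """Pick the source of the earliest message as the new proxy node.

    Ties are broken by the lowest source identifier (an assumption) so that
    every node evaluating the same messages reaches the same answer.
    """
    winner = min(messages, key=lambda m: (m.timestamp, m.source))
    return winner.source


msgs = [
    ProxySelectionMessage("n2", 5.0),
    ProxySelectionMessage("n3", 3.0),
    ProxySelectionMessage("n1", 3.0),
]
new_proxy = select_by_earliest_timestamp(msgs)
```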

Upon selection of a new or replacement proxy node by agreed upon rule at step 433, method 400 preferably proceeds to step 436 where event generation may be restarted according to the new arrangement of non-proxy nodes and the newly selected proxy node. For example, the new proxy node may be configured to monitor and report both distributed and node-specific events and to monitor the network for new nodes while the non-proxy nodes may be configured to report only node-specific events and to monitor the availability of the new proxy node. From step 436, method 400 preferably returns to step 415 where the addition of nodes to the network and the unavailability of the new proxy node are preferably monitored and awaited by the network management station 103.

If at step 430 it is determined that one of the return proxy node selection messages contains an existing proxy selection, method 400 preferably proceeds to step 439. At step 439, each of the nodes or a managing node, e.g., the current proxy node 106 or the non-proxy node 109 and/or 112 detecting the unavailability of the current proxy node 106, in receipt of return proxy selection messages preferably determines whether the existing proxy node selections indicated in the return proxy node selection messages received from the other nodes are in conflict with or match one another. If it is determined that there is a conflict or that the existing proxy node selections do not match, method 400 preferably proceeds to step 433 where the participating network nodes use an agreed upon rule for selecting a new proxy node generally as described above. Alternatively, if a single network node is evaluating whether there is a conflict among the existing proxy node selections, that network node may generate a message indicating such a conflict to the remaining participating network nodes and the need to proceed to step 433 for selection of a new proxy node by agreed upon rule. If it is determined that there are no conflicts or that the existing proxy node selections indicated in the return proxy selection messages match one another, method 400 preferably proceeds to step 442.

At step 442, a determination is made whether the proxy node selection submitted by and matching amongst the other participating network nodes matches with the evaluating or current network node's own existing proxy node selection, e.g., the managing node or each node in receipt of a return proxy node selection message. If the current node determines that the proxy node selection submitted matches its own proxy node selection, method 400 preferably proceeds to step 436 where event generation and reporting may be re-initiated generally as described above. Alternatively, if at step 442 the evaluating network node determines that it either does not have an existing proxy node selection or that its existing proxy node selection does not match or conflicts with the existing proxy node selection submitted by the remaining network nodes, method 400 preferably proceeds to step 445. At step 445, the current network node adopts the existing proxy node selection submitted by the remaining network nodes such that all participating network nodes now recognize the same new proxy node for the computer network 300. From step 445, method 400 preferably proceeds to step 436 where event generation, as described above, may be initiated.
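The decision flow of steps 430 through 445 can be summarized in a short sketch. The function name `resolve_proxy`, its parameters and the returned labels are hypothetical; `rule` stands in for whichever agreed upon selection rule the nodes share and is assumed to be a callable that picks one node from the candidates.

```python
def resolve_proxy(own_selection, reported, candidates, rule):
    """Return (proxy, how) for the evaluating node.

    own_selection -- this node's existing proxy selection, or None
    reported      -- mapping of other node id -> that node's existing selection (or None)
    candidates    -- nodes eligible to become proxy
    rule          -- agreed upon selection rule (callable over candidates)
    """
    existing = {p for p in reported.values() if p is not None}
    if not existing:                       # step 430: no existing selections
        return rule(candidates), "rule"    # step 433
    if len(existing) > 1:                  # step 439: selections conflict
        return rule(candidates), "rule"    # step 433
    (agreed,) = existing                   # others agree on a single proxy
    if own_selection == agreed:            # step 442: matches own selection
        return agreed, "match"
    return agreed, "adopted"               # step 445: adopt the others' selection


# Example: a newly added node (no selection of its own) joins an existing network.
result = resolve_proxy(None, {"109": "106", "112": "106"}, ["106", "109", "112"], min)
```

Here `min` is used as a stand-in rule purely for illustration; any deterministic rule shared by all participating nodes would serve.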

Method 400 provides numerous advantages over existing distributed event handling methods. One advantage of method 400 is that method 400 does not require the involvement of the network management station 103 for purposes other than processing node-specific and distributed events, i.e., the network management station 103 is not needed for proxy selection or for ensuring proxy availability, thus the resources of the network management station 103 may be reserved for event handling and other significant network management processing. In addition, method 400 reduces network traffic by preferably sending the network management station 103 only one copy of each distributed event.

As described herein, methods 200 and 400 provide clear advantages over the existing distributed event handling solutions. One such advantage is the elimination or reduction in double-handling when the network management station 103 receives multiple copies of the same event. As such, the methods described herein reduce the processing resources associated with double-handling, thereby freeing such resources for other processing or network management applications.

The invention, therefore, is well adapted to carry out the objects and to attain the ends and advantages mentioned, as well as others inherent therein. While the invention has been depicted, described, and is defined by reference to exemplary embodiments of the invention, such references do not imply a limitation on the invention, and no such limitation is to be inferred. The invention is capable of considerable modification, alteration, and equivalents in form and function, as will occur to those ordinarily skilled in the pertinent arts and having the benefit of this disclosure. The depicted and described exemplary embodiments of the invention are exemplary only, and are not exhaustive of the scope of the invention. Consequently, it is intended that the invention be limited only by the spirit and scope of the appended claims, giving full cognizance to equivalents in all respects.

1-26. (canceled)
 27. A computer network comprising: a plurality of network nodes, including a proxy node and a plurality of non-proxy nodes, the network nodes operably coupled to a communication network; the plurality of network nodes operable to cooperatively select the proxy node from the plurality of network nodes; the proxy node operable to detect and report distributed and node-specific events to a network management station via the communication network; and at least one non-proxy node operable to detect and report only node-specific events to the network management station while the proxy node remains available.
 28. The computer network of claim 27 further comprising at least one non-proxy node operable to monitor availability of the proxy node.
 29. The computer network of claim 28 further comprising at least one non-proxy node operable to initiate and participate in selection of a new proxy node with the non-proxy nodes in response to a lapse in the availability of the proxy node.
 30. The computer network of claim 27 further comprising at least one network node operable to detect a new node added to the communication network.
 31. The computer network of claim 30 further comprising at least one network node operable to initiate selection of a proxy node in response to detection of a new node.
 32. The computer network of claim 27 further comprising each network node participating in proxy node selection operable to exchange an existing proxy node selection message with one another, each existing proxy node selection message identifying a proxy node known to each exchanging network node.
 33. The computer network of claim 32 further comprising at least one network node operable to apply an agreed upon rule for proxy node selection in response to a conflict between the proxy nodes identified in the exchanged proxy node selection messages.
 34. The computer network of claim 32 further comprising at least one network node participating in selection of the proxy node operable to detect a conflict between the proxy nodes identified in the existing proxy node selection messages.
 35. The computer network of claim 34 further comprising the at least one network node operable to select the proxy node identified in the existing proxy node selection messages if no conflict is detected.
 36. A network computing device comprising: at least one processor; memory operably coupled to the processor; a communication interface operably coupled to the processor and the memory, the communication interface operable to communicate with at least one network node and a network management station via a communication network; and a program of instructions storable in the memory and executable by the processor, the program of instructions operable to cooperate with at least one network node to select a proxy node and further operable to report events to the network management station according to selection of the network computing device as a proxy node or a non-proxy node.
 37. The network computing device of claim 36 further comprising the program of instructions operable to report both distributed and node-specific events to the network management station in response to selection of the network computing device as the proxy node.
 38. The network computing device of claim 36 further comprising the program of instructions operable to report node-specific events to the network management station while a selected proxy node is available.
 39. The network computing device of claim 36 further comprising the program of instructions operable to exchange existing proxy node selections with the at least one network node and to detect a conflict between the existing proxy node selections.
 40. The network computing device of claim 39 further comprising the program of instructions operable to select the proxy node according to the existing proxy node selections if there is no conflict detected and to select the proxy node according to one or more rules in response to detection of a conflict between the existing proxy node selections.
 41. The network computing device of claim 36 further comprising the program of instructions operable to monitor availability of the proxy node and to initiate selection of a new proxy node in response to a lapse in proxy node availability.
 42. The network computing device of claim 36 further comprising the program of instructions operable to initiate proxy node selection in response to detection of a new network node.